Novel approaches to Arabic speech recognition: report from the 2002 Johns-Hopkins Summer Workshop

Authors

  • Katrin Kirchhoff
  • Jeff A. Bilmes
  • Sourin Das
  • Nicolae Duta
  • Melissa Egan
  • Gang Ji
  • Feng He
  • John Henderson
  • Daben Liu
  • Mohammed Noamany
  • Patrick Schone
  • Richard M. Schwartz
  • Dimitra Vergyri
Abstract

Although Arabic is currently one of the most widely spoken languages in the world, there has been relatively little speech recognition research on Arabic compared to other languages. Moreover, most previous work has concentrated on the recognition of formal rather than dialectal Arabic. This paper reports on our project at the 2002 Johns Hopkins Summer Workshop, which focused on the recognition of dialectal Arabic. Three problems were addressed: (a) the lack of short vowels and other pronunciation information in Arabic texts; (b) the morphological complexity of Arabic; and (c) the discrepancies between dialectal and formal Arabic. We present novel approaches to automatic vowel restoration, morphology-based language modeling, and the integration of out-of-corpus language model data, and report significant word error rate improvements on the LDC Arabic CallHome task.

Similar resources

Large-vocabulary audio-visual speech recognition: a summary of the Johns Hopkins Summer 2000 Workshop

We report a summary of the Johns Hopkins Summer 2000 Workshop on audio-visual automatic speech recognition (ASR) in the large-vocabulary, continuous speech domain. Two problems of audio-visual ASR were mainly addressed: Visual feature extraction and audio-visual information fusion. First, image transform and model-based visual features were considered, obtained by means of the discrete cosine t...

Pronunciation Modelling for Conversational Speech Recognition: a Status Report from Ws97

Accurately modelling pronunciation variability in conversational speech is an important component for automatic speech recognition. We describe some of the projects undertaken in this direction at WS97, the Fifth LVCSR Summer Workshop, held at Johns Hopkins University, Baltimore, in July-August, 1997. We first illustrate a use of hand-labelled phonetic transcriptions of a portion of the Switchb...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Mandarin Pronunciation Variation Modeling 1

1 This work was a report for the project “Mandarin pronunciation modeling” supported by the National Science Foundation of USA under Grant No. #IIS-9820687, and carried out in the 2000 Summer Workshop on Language and Speech Processing, Center for Language and Speech Processing, Johns Hopkins University (http://www.clsp.jhu.edu/ws2000/), and a report of its further research. Any opinions, findin...

Dialectal Chinese Speech Recognition : Final Report

†Richard Sproat, University of Illinois (Thomas) Fang Zheng, Tsinghua University Liang Gu, IBM Jing Li, Tsinghua University Yanli Zheng, University of Illinois Yi Su, Johns Hopkins University Haolang Zhou, Johns Hopkins University Philip Bramsen, MIT David Kirsch, Lehigh University Izhak Shafran, Johns Hopkins University Stavros Tsakalidis, Johns Hopkins University Rebecca Starr, Stanford Unive...

Publication date: 2003